Abstract.

As the current fleet of meteorological satellites ages, the accuracy of the imagery sensed on a
spectral channel of the image scanning system is continually and progressively degraded by noise. 
In time, that data may even become unusable. We describe a novel approach to the reconstruction 
of the noisy satellite imagery according to empirical functional relationships that tie the spectral 
channels together. Abductive networks are applied to automatically learn the empirical functional
relationships among the data sensed on the other spectral channels and to calculate the data that
should have been sensed on the corrupted channel. Using imagery unaffected by noise, it is
demonstrated that abductive networks correctly predict the noise-free observed data. 

1 Introduction. 

The fleet of four polar orbiting meteorological satellites currently operated by the National
Oceanic and Atmospheric Administration (NOAA) carries a multi-spectral sensing system for
imaging the Earth. This system, the Advanced Very High Resolution Radiometer (AVHRR), 
measures irradiances in five narrow spectral bands ranging from the visible to the infrared (IR) 
parts of the electromagnetic spectrum. The system is described in section 2 below. Suffice it to 
say here that by virtue of the high resolution of the instrument, a wealth of data is available. 

It has been noted that one of the five spectral channels of the AVHRR (channel 3) is 
particularly susceptible to noise and its accuracy degrades with age, perhaps to the point where the 
data is unusable (Ref. 1). The possibility also exists that some of the archived AVHRR imagery 
from the older satellites that have been replaced with the current generation of spacecraft may also 
be of questionable quality. 

The problem faced is the use of archived and real-time satellite imagery which may be 
partially corrupted by noise. One approach is to correct the data to its true but a priori unknown 
value. Because the channel is continually and progressively degraded by noise, any correction
scheme requires constant maintenance. 


1 SAIC Technical Report SAIC-94/1062 

An alternative approach, pursued in the study described here, is to replace the data measured on
the noisy channel with data constructed from the other four spectral channels. Our approach relies
on a technique called abductive networks that automatically discovers the relationships among the
spectral channels that are embedded in the measured data. In this way the noisy satellite imagery
is reconstructed according to empirical functional relationships that tie the spectral channels 
together. 

Here we describe the application of a proprietary tool for creating abductive networks to the 
modelling of the AVHRR. Specifically, channel 3 is modelled as the output calculated from the 
empirical inputs of the other four spectral channels. Our approach was exercised on imagery 
collected with the AVHRR on NOAA-11, which is not as yet seriously compromised by noise.
The data predicted for channel 3, with the other channels as inputs to the network that was created,
was then statistically compared to the data actually observed. The network proved
highly successful at simulating the observed output.

The next section provides a short description of the AVHRR. Section 3 gives an overview of 
abductive technology. Section 4 describes the application of abductive networks to satellite 
imagery with the objective of uncovering the effective relationship between the imagery sensed in 
an intermediate spectral band and the imagery sensed in the neighboring bands. Our conclusions, 
principally that abductive networks show great promise for reconstructing noisy satellite imagery, 
are presented in section 5.

2 A Brief Description of the AVHRR. 

The AVHRR currently flown aboard the NOAA polar orbiting meteorological satellites is a 
downward-pointing cross-track scanning system. It makes radiometric measurements in five 
spectral channels: two in the visible and adjacent near-infrared (near-IR) part of the spectrum
(channels 1 and 2) and three in the IR part (channels 3, 4, and 5). The spectral band widths, in
microns (µm), are summarized in Table 1. For NOAA-10 only, the spectral band of channel 4 is
10.50 - 11.50 µm, and channel 5 output is a repeat of channel 4. The field of view of each
channel is approximately 1.4 milliradians leading to a nadir resolution of about 1.1 km (for a 
nominal satellite altitude of 833 km). There are 2048 pixels per scan line, where each pixel covers
about 2 microsteradians.
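
The nadir resolution quoted above follows from simple viewing geometry. The following is a minimal sketch of that arithmetic, assuming the ground is flat across a single footprint; the function and variable names are illustrative and do not come from any AVHRR processing software.

```python
import math

def nadir_footprint_km(fov_mrad: float, altitude_km: float) -> float:
    """Width of the ground footprint directly below the satellite.

    Assumption: the ground is treated as flat over the footprint, which is
    reasonable for a milliradian-scale field of view.
    """
    half_angle_rad = 0.5 * fov_mrad * 1.0e-3
    return 2.0 * altitude_km * math.tan(half_angle_rad)

# Field of view and nominal altitude as given in the text.
footprint_km = nadir_footprint_km(fov_mrad=1.4, altitude_km=833.0)
print(f"nadir footprint: {footprint_km:.2f} km")
```

With the stated 1.4 milliradian field of view and 833 km altitude the footprint comes out just under 1.2 km, in line with the approximate resolution quoted in the text.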


Table 1. Spectral band widths of the AVHRR.

Channel #    Band Width (µm)
    1         0.580 -  0.680
    2         0.725 -  1.100
    3         3.550 -  3.930
    4        10.300 - 11.300
    5        11.500 - 12.500

The analogue data output from the sensors is digitized on board the satellite. The IR channels
are calibrated in flight using views of a stable black body and of space as references. No in-flight
calibration of the visible channels is performed, although the space view is available as a reference point.

The radiometer data collected by channel 3 of each NOAA satellite have been very noisy due 
to sensor problems and may eventually become unusable (Ref. 1). This is especially true when the
satellite is in daylight. (Of course, channels 1 and 2 are blank for nocturnal views.) 

The normal operating mode of the AVHRR scanning system is to capture a scan line in a 
buffer and continuously broadcast the digital data in a wide beam aimed at the Earth. The direct 
transmission mode is called High Resolution Picture Transmission (HRPT). Ground processing 
of the HRPT consists of its calibration, earth location, and breakout of the individual sensor 
channels. 

3 An Introduction to Abductory Induction and Abductive Networks. 

Abductive reasoning, or abduction, is defined as the process of reasoning under conditions 
of uncertainty from general principles and initial facts to new facts (Ref. 2). Abduction differs 
from deduction, in which all principles and facts are taken to be known with complete
certainty.

Induction is the process of reasoning from specific facts to general principles. This 
reasoning process handles the many real-world situations that are rich in empirical data but lack
sufficient conceptual understanding to unify that data into a coherent, accurate view of the world. 
Ideally, the facts supplied to an inductive argument are known with absolute certainty. In the real 
world, however, the facts are contaminated with uncertainties. Uncertainty arises, for example, 
from imprecise, unreliable or incomplete information. Even with indisputable information, 
uncertainty arises due to a lack of complete and thorough knowledge and understanding of the 
situation. Then the generalities inductively reasoned from those facts must themselves be 
uncertain. As a result, the reasoning itself contains uncertainty. Abduction is the reasoning 
process that incorporates this realistic view of uncertainty. 

A practical implementation of abductive reasoning uses numeric functions and measures, 
called abductive measures, to convey the inherent importance of a single fact or piece of 
information (Ref. 2). Abductive measures represent relationships between facts. These should 
be viewed as working, rather than ‘true’, relationships in the sense that they predict nature 
correctly even if for the wrong reasons. 

Abductive measures are used to decompose complex problems into subproblems in a process 
called chunking. Here a limited number of facts, or types of facts, are dealt with at a time. They 
are summarized in terms of single abductive measures. The chunks are then united by 
appropriately combining their respective abductive measures. 

Abductory induction is the process of creating general principles from databases of empirical 
observations. Abductory induction is applied to create an abductory model of the process 
described implicitly by the database by formulating, or at least approximating, the relationship
between the database variables in terms of the contained data. The abductory model is most 
conveniently posed as an unstructured network or a cascade of mathematical equations. This 
adaptability property makes abductory induction particularly well suited to unsupervised machine 
learning. The problem immediately posed is determining the appropriate functions, and hence the 
layout of the network, out of an infinite number of candidates, that best describes the data. 
Assuming a model structure, as is done in regression techniques, may result in poor fits to the 
data just because that specific structure is not present. An alternative is to model the data with a 
very general multinomial. Within very broad assumptions any arbitrary function may be 
approximated by a polynomial (i.e., a truncated Taylor series); the accuracy of the approximation 
is directly related to the number of terms retained, that is, the degree of the polynomial. However,
the many coefficients needed for even a small set of variables make this approach intractable for
degrees much above 2.
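
The growth in the number of coefficients can be made concrete. A complete polynomial of degree d in n variables has C(n + d, d) terms, counting the constant. The short sketch below tabulates that count; the function name is illustrative, and the choices of 4 and 20 variables are examples only.

```python
from math import comb

def full_polynomial_terms(n_vars: int, degree: int) -> int:
    """Number of coefficients in a complete polynomial, constant term included."""
    return comb(n_vars + degree, degree)

# Tabulate the coefficient count for a small and a moderately sized problem.
for n_vars in (4, 20):
    counts = [full_polynomial_terms(n_vars, d) for d in (1, 2, 3, 4)]
    print(f"{n_vars} variables, degrees 1-4: {counts}")
```

Even a modest number of variables produces thousands of coefficients by degree 3 or 4, which is the motivation for fitting small low-order polynomials one node at a time.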

A practical solution to modelling the database in terms of its uncertain structure is to apply the 
chunking concept and split the input variables among several groups. The groups are collectively 
input to the individual nodes of an incipient network and the relations among them are 
summarized in terms of an abductive measure. These results are then passed on to the next layer 
of the evolving network. The labor is substantially reduced because only the model associated 
with a single node must be determined at any single time. 

Abductive networks are networks of functional nodes (Ref. 3). Neural networks may be 
considered a special class of biologically-motivated abductive networks. Incorporating the 
chunking concept, a very effective algorithm for creating abductive networks utilizes polynomial 
equations (of moderate order) for the abductive measures. Given a database of example situations 
about a problem consisting of a representative set of inputs and outputs, an abductive network can 
be used to fit the best polynomials relating the variables, node by node, cascading layer to layer. 
Specifically, the inputs to each node are processed and the node output is passed, along with the
original input variables, to the nodes in subsequent layers of the network. The result is a compact representation of the
interactions between the variables as evidenced in the massive amount of empirical data. 

The Abductory Induction Mechanism (AIM™) is proprietary software of AbTech 
Corporation for implementing abductory induction for the automatic and unsupervised creation of 
abductive networks (Ref. 4). The network created by AIM is a robust and efficient representation 
of the relationships existing between the variables contained in the database. AIM uses 
polynomials of up to degree 3; the polynomials contain cross-terms to allow interaction between 
node inputs. Not every term need appear in a given nodal polynomial because AIM, in a
process called carving, discards terms that do not contribute significantly. The network size,
chunking and connectivity (between chunks and/or inputs), and coefficient values are all 
determined automatically by AIM. Networks are created from layer to layer until the network 
model ceases to be improved according to a modeling criterion. The criterion assures that as 
accurate a network as possible is created without overfitting the data (that is, tailoring the network 
specifically to the supplied database). 

4 An Application of AIM to the AVHRR Calibrated Channel Output.

It was mentioned above that, for the older satellites in the NOAA fleet, channel 3 is very 
noisy, to the point of being unusable without significant suppression of the noise effects. The 
objective is to reconstruct channel 3 from the other four spectral channels. 

The AVHRR instrument scans the scene pixel by pixel in all five spectral channels 
simultaneously. Of course, the sensors will make different irradiance measurements in the 
different spectral bands. However, in a pixel, excluding any possible misalignment among the 
five fields of view, the channel 3 irradiance must be related to the irradiances measured on the 
other spectral channels. That relation is a complex problem in radiative transfer for both solar and 
terrestrial photons. An alternative to a possibly intractable theoretical analysis is to use a satellite 
imagery database consisting of AVHRR calibrated output to uncover empirical relationships 
between channel 3 and the other channels contained in the data. 

The SAIC satellite ground station at received imagery from the NOAA-11 satellite for a pass
over the eastern United States on 25 February 1994 around 2139 UTC (16:39 EST). NOAA-11
was launched September 24, 1988; as the second youngest satellite in current operation, it is two
years younger than NOAA-10 and nearly four years younger than NOAA-9. The AVHRR on NOAA-11
has not yet evidenced severe degradation of any of its spectral irradiance measurements. The
downlinked data was calibrated, rectified, and broken out into its individual channels, which were 
separately saved to file. The satellite image contained in excess of 1000 scan lines. A 500 line by 
500 pixel box was extracted from the southwest corner of the image and sampled for every other
scan line, so that the channel databases each contained 250 scan lines nominally separated by 2.2 
km in the direction of the satellite track. The data for each individual channel were then ordered 
by pixel, for a total of 125000 pixels, in a single file for each of the five channels. Each pixel is 
considered as an individual observation containing five values, one for each of the spectral 
channels. 
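
The sampling just described can be sketched in a few lines. This is an illustration only: the imagery here is placeholder data (channel k is filled with the constant value k), and the function name `subsample_observations` is hypothetical rather than part of the ground-station software.

```python
N_CHANNELS = 5
BOX_LINES = 500
BOX_PIXELS = 500

def subsample_observations(channels, line_step=2):
    """Keep every `line_step`-th scan line and return one tuple per pixel.

    `channels` is a list of 2-D lists (scan lines x pixels), one per
    spectral channel. Each returned tuple holds the values of all
    channels at a single pixel.
    """
    n_lines = len(channels[0])
    observations = []
    for line in range(0, n_lines, line_step):
        rows = [channel[line] for channel in channels]
        observations.extend(zip(*rows))
    return observations

# Placeholder imagery, not real AVHRR data.
channels = [
    [[k] * BOX_PIXELS for _ in range(BOX_LINES)]
    for k in range(1, N_CHANNELS + 1)
]
observations = subsample_observations(channels)
print(len(observations))   # 250 lines x 500 pixels
print(observations[0])     # one value per channel
```

Keeping every other line of the 500-line box leaves 250 lines of 500 pixels, which gives the 125000 five-valued observations stated above.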

The AIM software package was applied to the image box. Memory limitations in AIM 
prevented use of the entire database because AIM is limited to only 8000 observations. As a 
result, four 8000-pixel strips were extracted from the image box. Specifically, the extracted 
image box was divided into four sections in the along-track direction. Each section of the image 
box was sampled in blocks of 8000 pixels such that each block contains sixteen sequential
500-pixel neighboring scan lines; the blocks are spatially coherent. A fifth block was created by
assembling four adjacent scan lines from each of the four image strips. Note that while the
quarters of this image block are spatially coherent, the quarters are spatially decoupled from each 
other. 
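
The block bookkeeping can be checked with a short sketch. The text does not say which sixteen lines were taken from each section, nor which four lines of each strip entered the composite, so the sketch assumes the first lines in each case; the section boundaries and all names are likewise assumptions made for illustration.

```python
PIXELS_PER_LINE = 500
N_LINES = 250            # scan lines remaining after subsampling
LINES_PER_BLOCK = 16
AIM_LIMIT = 8000         # maximum number of observations accepted by AIM

def section_starts(n_lines, n_sections):
    """First line index of each along-track section (assumed equal split)."""
    return [(k * n_lines) // n_sections for k in range(n_sections)]

def coherent_blocks(n_lines=N_LINES, n_sections=4, lines_per_block=LINES_PER_BLOCK):
    """Sixteen sequential lines per section. Assumption: the first sixteen."""
    return [list(range(start, start + lines_per_block))
            for start in section_starts(n_lines, n_sections)]

def composite_block(blocks, lines_from_each=4):
    """Four adjacent lines from each strip. Assumption: the first four."""
    lines = []
    for block in blocks:
        lines.extend(block[:lines_from_each])
    return lines

blocks = coherent_blocks()
block5 = composite_block(blocks)
for name, lines in [("1", blocks[0]), ("2", blocks[1]), ("3", blocks[2]),
                    ("4", blocks[3]), ("5", block5)]:
    print(f"block {name}: {len(lines) * PIXELS_PER_LINE} pixels")
```

Each of the five blocks holds sixteen 500-pixel lines, exactly the 8000 observations AIM can accept, while the composite block spans all four sections of the image box.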

Channels 1, 2, 4, and 5 were designated as the network inputs and channel 3 was designated 
as the network output. Individual networks were created for each of the five image blocks. The 
networks were created on a Macintosh SE/30. The time required for forming the network 
obviously depended on network complexity. Creation times ranged from about 20 minutes up to 
nearly an hour. Each network was then evaluated against both its own creating image block and 
the other four image blocks. The evaluation consisted of statistically comparing the channel 3 
output predicted by the network to the channel 3 data actually observed.

All of the networks performed well on self-evaluation. Generally the networks degraded
with spatial distance of the evaluating image block from the network-forming image block, that is, 
with progressive spatial decoupling between the image blocks. The exception was the AIM 
network created with image block 5 (the four-strip composite through the 500 line by 500 pixel 
image box). That network, presented in figure 1, generally outperformed all the other networks, 
except for their own self-evaluation. 

As can be seen in figure 1, the network created with image block 5 is a four-layer network of 
feed-forward elements, that is, the network cascades from the raw input variables on the left to the 
single output variable on the right. The inputs are the calibrated AVHRR data from channels 1, 2, 
4, and 5. Note however that only channels 1, 4, and 5, referred to as ch1, ch4, and ch5,
respectively, were used. The final output is the network-predicted (calibrated) response for
AVHRR channel 3, referred to as ch3. Channel 2 was carved from the inputs because of its
partial redundancy, most likely with channel 1. The numbers and types of network elements, the
element polynomial functions, and their connectivity are learned abductively (induction under 
uncertainty). The coefficients of the element functions are determined by multiple linear 
regression of terms up to power three. The structure of the network is determined according to a 
set of rules and heuristics that are an inherent part of the AIM network creation strategy. The best 
network, in terms of its structure, element types, coefficients, and connectivity, is found 
automatically by minimizing a modeling criterion that seeks the most accurate network possible 
within acceptable tolerance (this avoids creating a network tailored to only the training data). 

In figure 1, the open circles following the inputs are ‘normalizers’. They transform the
original input variables to standard variables with zero mean and unit variance. This assures that 
all input variables will be fairly represented in the network. The boxes labelled double and triple 
are elements whose name is based on the number of inputs from the previous layer. These 
elements are described by fairly general third-order polynomials. Doubles and triples may have 
some significant explicit cross-product terms, allowing interaction among the node input 
variables. Note that the output of any given element can feed subsequent layers as can the original 
variables. The open circle preceding the network output (ch3) is a so-called unitizer. A unitizer
converts the standardized range of the intermediate network output to the units of the output
variable used to create the network; it is an inverse normalizer. 
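
The normalizer and unitizer are inverse operations, which a small sketch makes explicit. This is not AIM's implementation: the class names are illustrative, the population standard deviation is an assumed convention, and the sample values are made up.

```python
from statistics import fmean, pstdev

class Normalizer:
    """Maps a variable to zero mean and unit variance over the training data."""

    def __init__(self, training_values):
        self.mean = fmean(training_values)
        self.std = pstdev(training_values)  # assumption: population std
        if self.std == 0.0:
            raise ValueError("cannot normalize a constant variable")

    def __call__(self, value):
        return (value - self.mean) / self.std

class Unitizer:
    """Inverse normalizer: maps a standardized value back to original units."""

    def __init__(self, training_values):
        self.mean = fmean(training_values)
        self.std = pstdev(training_values)

    def __call__(self, standardized):
        return standardized * self.std + self.mean

# Made-up sample values, used only to show the round trip.
training_values = [2290.0, 2310.0, 2330.0, 2350.0]
normalize = Normalizer(training_values)
unitize = Unitizer(training_values)
standardized = [normalize(v) for v in training_values]
restored = [unitize(z) for z in standardized]
print(standardized)
print(restored)
```

In the network the normalizers act on the input channels and the unitizer acts on the output channel; the same variable is used for both here only to show that the two transforms undo one another.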

Figure 2 plots the observed output of channel 3 in image block 5 against the output predicted 
for channel 3 using the data measured on the other channels in block 5 as input to the network 
created with block 5. This is a self-evaluation of the network created with image block 5. The 
line with unit positive slope indicates perfect correlation between the observed and the predicted 
channel 3 output. The overwhelming bulk of the 8000 network-predicted channel 3 values 
straddle the line, indicating the high quality of the network fit to the observations. The correlated 
data appear to group predominantly into two large clusters hugging the unit line (the upslope 
cluster being the more massive of the two). Apparently the observed channel 3 data is inherently 
bimodal; this bimodal distribution is captured in the network predictions. Figure 3 displays the 
normalized errors for this self-evaluation of the block 5 network. Normalized error is defined as
the difference between the observed and predicted values for channel 3, normalized by the
observed value. As in the previous figure, the normalized errors group predominantly into two 
large clusters hugging the zero-error line. The larger of the two clusters sits over the mean of the 
channel 3 observations for block 5 (2324.3). The normalized errors are mainly within ±5%, and 
nearly evenly dispersed around the zero-error line. As the observed values depart from the block 
mean, error grows, implying that the network performance degrades. Even at its worst, 
however, the normalized error is mostly within about 15%. 
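
The normalized error defined above, and the fraction of pixels falling within a given tolerance, can be computed as in the sketch below. The function names are illustrative and the sample values are made up for the demonstration; they are not taken from the imagery.

```python
def normalized_errors(observed, predicted):
    """(observed - predicted) / observed for each pixel."""
    return [(o - p) / o for o, p in zip(observed, predicted)]

def fraction_within(errors, tolerance):
    """Fraction of errors whose magnitude does not exceed `tolerance`."""
    return sum(abs(e) <= tolerance for e in errors) / len(errors)

# Made-up values for illustration only.
observed = [2300.0, 2320.0, 2350.0, 2000.0]
predicted = [2310.0, 2315.0, 2300.0, 2250.0]

errors = normalized_errors(observed, predicted)
print([round(e, 4) for e in errors])
print(fraction_within(errors, 0.05))
print(fraction_within(errors, 0.15))
```

Because the error is normalized by the observed value, it is a signed relative error: negative values indicate that the network overpredicts.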

Figure 4 plots the observed output of channel 3 in image block 3 against the output predicted 
for channel 3 using the data measured on the other channels in block 3 as input to the network 
created with block 5. The perfect correlation line with unit positive slope is displayed for 
comparison. Here, the agreement between the predicted and the observed values is excellent, as 
evidenced by the near perfect collapse onto the 45° line. The normalized errors between the block 
5 network predictions of channel 3 for block 3 and the actual block 3 observations are shown in 
figure 5. The normalized errors are strongly clustered about the actual block mean of 2316.9, and
are about 1% in magnitude. Note that in general the network tends to very slightly overpredict the channel 3
output. The strong clustering of the normalized errors reflects the narrower range and tighter
clustering of the block 3 data about its mean (standard deviation = 27.0, or about 1.2% of the mean).

5 Conclusions. 

We have demonstrated that abductive networks are very successful in modelling the
measurements collected with the AVHRR in our specific test case. The abductive networks
created with AIM provide reliable and compact representations of the AVHRR spectral channels in
terms of diagnosing the empirical relationship between channel 3 and the other four spectral 
channels. The network trained with the composite database selectively extracted from the imagery 
so as to have only partial spatial coherence generally outperformed the networks trained with 
spatially coherent databases, except perhaps for the self-evaluation. This indicates some near- 
universality exists in the relationship between the channels, which may be found by the 
appropriate sampling of the satellite imagery. The general use of abductive networks for 
modelling the AVHRR towards reconstructing the noisy data collected on its channel 3 shows 
great promise. 

Another possible use for abductive networks is as a quality-control monitor. Specifically, the 
real-time degradation of channel 3 can be measured by periodically comparing its observations to 
network predictions. For example, the channel may be considered corrupted if the average error, 
say, between the observed and the predicted channel 3 output through an image (or a piece of an 
image) exceeds some established threshold. 
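
Such a monitor might look like the following sketch. The choice of mean absolute normalized error as the statistic, the 5% threshold, and all names are assumptions made for illustration; the sample values are made up, and observed values are assumed to be nonzero.

```python
def mean_abs_normalized_error(observed, predicted):
    """Average of |observed - predicted| / |observed| over an image piece."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")
    total = sum(abs(o - p) / abs(o) for o, p in zip(observed, predicted))
    return total / len(observed)

def channel_is_corrupted(observed, predicted, threshold=0.05):
    """Flag the channel when the average error exceeds the threshold.

    The default threshold is a hypothetical value, not one from the study.
    """
    return mean_abs_normalized_error(observed, predicted) > threshold

# Made-up values: network predictions and two sets of channel 3 observations.
predicted = [2300.0, 2320.0, 2340.0, 2360.0]
clean_obs = [2310.0, 2315.0, 2345.0, 2350.0]
noisy_obs = [2000.0, 2700.0, 2100.0, 2600.0]

print(channel_is_corrupted(clean_obs, predicted))
print(channel_is_corrupted(noisy_obs, predicted))
```

In practice the threshold would have to be set from the error levels the network shows on imagery known to be free of noise.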

Figure 1. The AIM abductive network created with image block 5 (the composite block
through the image box) showing the numbers and types of network elements.

Figure 2. Observed channel 3 output in image block 5 vs. the predicted output using
the other block 5 channels as inputs to the network created with image block 5.

Figure 3. Normalized channel 3 output errors in image block 5, between the observed
output and the output predicted from the network created with image block 5.